As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, the prevalent data-driven techniques such as large-scale language models suffer from data inadequacy and biased corpus, especially for languages with insufficient resources such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation CORGI-PM, which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. Moreover, we address three challenges for automatic textual gender bias mitigation, which requires the models to detect, classify, and mitigate textual gender bias. We also conduct experiments with state-of-the-art language models to provide baselines. To our best knowledge, CORGI-PM is the first sentence-level Chinese corpus for gender bias probing and mitigation.
translated by 谷歌翻译
The deep learning community has witnessed an exponentially growing interest in self-supervised learning (SSL). However, it still remains unexplored how to build a framework for learning useful representations of raw music waveforms in a self-supervised manner. In this work, we design Music2Vec, a framework exploring different SSL algorithmic components and tricks for music audio recordings. Our model achieves comparable results to the state-of-the-art (SOTA) music SSL model Jukebox, despite being significantly smaller with less than 2% of parameters of the latter. The model will be released on Huggingface(Please refer to: https://huggingface.co/m-a-p/music2vec-v1)
translated by 谷歌翻译
Fairness has become a trending topic in natural language processing (NLP), which addresses biases targeting certain social groups such as genders and religions. However, regional bias in language models (LMs), a long-standing global discrimination problem, still remains unexplored. This paper bridges the gap by analysing the regional bias learned by the pre-trained language models that are broadly used in NLP tasks. In addition to verifying the existence of regional bias in LMs, we find that the biases on regional groups can be strongly influenced by the geographical clustering of the groups. We accordingly propose a HiErarchical Regional Bias evaluation method (HERB) utilising the information from the sub-region clusters to quantify the bias in pre-trained LMs. Experiments show that our hierarchical metric can effectively evaluate the regional bias with respect to comprehensive topics and measure the potential regional bias that can be propagated to downstream tasks. Our codes are available at https://github.com/Bernard-Yang/HERB.
translated by 谷歌翻译
本文回顾了AIM 2022上压缩图像和视频超级分辨率的挑战。这项挑战包括两条曲目。轨道1的目标是压缩图像的超分辨率,轨迹〜2靶向压缩视频的超分辨率。在轨道1中,我们使用流行的数据集DIV2K作为培训,验证和测试集。在轨道2中,我们提出了LDV 3.0数据集,其中包含365个视频,包括LDV 2.0数据集(335个视频)和30个其他视频。在这一挑战中,有12支球队和2支球队分别提交了赛道1和赛道2的最终结果。所提出的方法和解决方案衡量了压缩图像和视频上超分辨率的最先进。提出的LDV 3.0数据集可在https://github.com/renyang-home/ldv_dataset上找到。此挑战的首页是在https://github.com/renyang-home/aim22_compresssr。
translated by 谷歌翻译
随着现代世界中对高度安全和可靠的轻质系统的需求增加,物理上无统治的功能(PUF)继续承诺可轻巧的高成本加密技术和安全钥匙存储。虽然PUF承诺的安全功能对安全系统设计师具有很高的吸引力,但已证明它们容易受到各种复杂攻击的攻击 - 最著名的是基于机器的建模攻击(ML -MA),这些攻击(ML -MA)试图以数字方式克隆PUF行为因此破坏了他们的安全。最新的ML-MA甚至还利用了PUF误差校正所需的公开辅助数据,以预测PUF响应而无需了解响应数据。为此,与传统的PUF储存技术和比较的PUF技术相反,研究开始研究PUF设备的身份验证,并进行了著名的挑战 - 响应对(CRP)的比较。在本文中,我们基于新颖的“ PUF - 表型”概念提出了一个使用ML的分类系统,以准确识别起点并确定得出的噪声记忆(DRAM)PUF响应的有效性作为助手数据依赖数据的Denoisis技术的替代方法。据我们所知,我们是第一个每个模型对多个设备进行分类的人,以实现基于组的PUF身份验证方案。我们使用修改后的深卷积神经网络(CNN)最多达到98 \%的分类精度,并与几个完善的分类器结合使用特征提取。我们还在实验中验证了在Raspberry Pi设备上模型的性能,以确定在资源约束环境中部署我们所提出的模型的适用性。
translated by 谷歌翻译
名义隐喻经常用于人类语言,并已被证明可以说服,表达情感和刺激兴趣。本文解决了中国名义上隐喻(NM)一代的问题。我们引入了一个新颖的多任务框架,该框架共同优化了三个任务:NM识别,NM组件识别和NM生成。隐喻识别模块能够执行自我训练程序,该程序从大型未标记的语料库中发现了新的隐喻,以进行NM生成。 NM组件识别模块在训练和条件下强调了这些NM组件上的成分,以获得更连贯的结果。为了训练NM识别和组件识别模块,我们构建了一个注释的语料库,该语料由6.3K句子组成,其中包含多种隐喻模式。自动指标表明,我们的方法可以产生具有良好可读性的多种隐喻,其中92%是新颖的隐喻比较。人类评估表明,我们的模型在一致性和创造力方面显着优于基准。
translated by 谷歌翻译
跨语言嵌入技术(CLWE)的技术在应对低资源语言的自然语言处理挑战方面起着基本作用。它的主要方法假设嵌入之间的关系可以由线性映射表示,但是没有探索该假设所存在的条件。这种研究差距最近变得非常危急,因为已经证明,放松映射是非线性的,在某些情况下可以提高性能。我们首次提出了一个理论分析,该分析将单词嵌入中编码的类比保存是一种必要且充分的条件,用于在这些嵌入之间的地面clwe映射是线性的。在一个涵盖十二种不同语言的五个代表性类比类别的新型跨语性类比数据集中,我们进行了实验,为我们的理论主张提供直接的经验支持。这些结果提供了对其他研究人员的观察结果的更多见解,并为制定更有效的跨语性代表性学习策略做出了贡献。
translated by 谷歌翻译
Gradient-based explanation is the cornerstone of explainable deep networks, but it has been shown to be vulnerable to adversarial attacks. However, existing works measure the explanation robustness based on $\ell_p$-norm, which can be counter-intuitive to humans, who only pay attention to the top few salient features. We propose explanation ranking thickness as a more suitable explanation robustness metric. We then present a new practical adversarial attacking goal for manipulating explanation rankings. To mitigate the ranking-based attacks while maintaining computational feasibility, we derive surrogate bounds of the thickness that involve expensive sampling and integration. We use a multi-objective approach to analyze the convergence of a gradient-based attack to confirm that the explanation robustness can be measured by the thickness metric. We conduct experiments on various network architectures and diverse datasets to prove the superiority of the proposed methods, while the widely accepted Hessian-based curvature smoothing approaches are not as robust as our method.
translated by 谷歌翻译
Video super-resolution is one of the most popular tasks on mobile devices, being widely used for an automatic improvement of low-bitrate and low-resolution video streams. While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, demonstrating low FPS rates and power efficiency on mobile devices. In this Mobile AI challenge, we address this problem and propose the participants to design an end-to-end real-time video super-resolution solution for mobile NPUs optimized for low energy consumption. The participants were provided with the REDS training dataset containing video sequences for a 4X video upscaling task. The runtime and power efficiency of all models was evaluated on the powerful MediaTek Dimensity 9000 platform with a dedicated AI processing unit capable of accelerating floating-point and quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating an up to 500 FPS rate and 0.2 [Watt / 30 FPS] power consumption. A detailed description of all models developed in the challenge is provided in this paper.
translated by 谷歌翻译
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and propose the participants to design an efficient quantized image super-resolution solution that can demonstrate a real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to do a high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating an up to 60 FPS rate when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
translated by 谷歌翻译